This commit introduces several new scripts for generating changelog context and embeddings. Key additions include `extract_changelog_context.py`, which gathers repository context such as README content, module docstrings, project structure, and changelog history. The `create_changelog_embeddings.py` script generates embeddings for these context files, enhancing the efficiency of changelog entry generation. Additionally, a Dockerfile and entrypoint script are added to facilitate containerization of the application, along with a .dockerignore file to manage ignored files during the build process. GitHub workflows for testing and updating changelogs are also included, streamlining the CI/CD process for changelog management.
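As a rough illustration of the kind of context gathering described above, the sketch below collects README content, module docstrings, and a flat file listing. It is not the code from `extract_changelog_context.py`: the function names `extract_module_docstrings` and `extract_context`, the `README.md` filename, and the shape of the returned dictionary are all assumptions made for this example.

```python
import ast
from pathlib import Path


def extract_module_docstrings(root: Path) -> dict:
    """Map each parseable Python file under root to its module docstring.

    Hypothetical helper: files without a docstring are left out.
    """
    docstrings = {}
    for path in sorted(root.rglob("*.py")):
        try:
            tree = ast.parse(path.read_text(encoding="utf-8"))
        except (SyntaxError, UnicodeDecodeError):
            continue  # skip files that cannot be read or parsed
        doc = ast.get_docstring(tree)
        if doc:
            docstrings[path.relative_to(root).as_posix()] = doc
    return docstrings


def extract_context(root: Path) -> dict:
    """Gather README text, module docstrings, and the project file listing.

    Hypothetical helper: assumes the README is named README.md.
    """
    readme = root / "README.md"
    return {
        "readme": readme.read_text(encoding="utf-8") if readme.exists() else "",
        "docstrings": extract_module_docstrings(root),
        "structure": sorted(
            p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
        ),
    }
```

Calling `extract_context(Path("."))` from the repository root would return a dictionary that could then be written out as context files. Changelog history is omitted here for brevity.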
Pull Request Overview
This PR introduces several new scripts to automate the generation and visualization of changelog entries by extracting repository context, generating context embeddings, and integrating the process into CI/CD pipelines. Key changes include:
- Addition of Python scripts for generating HTML changelogs, changelog entries, context extraction, and embeddings.
- Enhancements to GitHub workflows for automated changelog updating and testing.
- New utility scripts for token counting and embedding statistics to optimize API usage.
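To make the token counting and cost estimation idea concrete, here is a minimal sketch. It does not reproduce `count_changelog_tokens.py`: the roughly-four-characters-per-token heuristic, the function names, and the `usd_per_1k_tokens` parameter are assumptions chosen for illustration, and a real tokenizer would give different counts.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Approximate the token count from character length.

    Assumption: about four characters per token, a common rough
    heuristic for English text, not an exact tokenizer.
    """
    return math.ceil(len(text) / chars_per_token)


def estimate_cost(token_count: int, usd_per_1k_tokens: float) -> float:
    """Estimate API cost in USD; the per-1k-token price is supplied by the caller."""
    return token_count / 1000 * usd_per_1k_tokens
```

Passing the price in as a parameter avoids hard-coding a rate that may change over time.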
Reviewed Changes
Copilot reviewed 10 out of 13 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| scripts/generate_changelog_html.py | Generates a visual HTML changelog with charts created using matplotlib. |
| scripts/generate_changelog_entry.py | Creates a markdown changelog entry from PR data with optional OpenAI integration. |
| scripts/extract_changelog_context.py | Extracts various repository context elements, such as README content, module docstrings, and project structure. |
| scripts/create_changelog_embeddings.py | Generates embeddings for context files using OpenAI API or a simulated fallback. |
| extract_embedding_stats.py | Provides statistics on token counts and embedding dimensions. |
| create_changelog_embeddings.py | Alternative embedding generation script (root directory) using NumPy simulation. |
| count_changelog_tokens.py | Counts tokens in context files and estimates API costs. |
| .github/workflows/update-changelog.yml | GitHub workflow to automate the changelog generation process upon PR activity. |
| .github/workflows/test-changelog-scripts.yml | Workflow for integration and unit testing of the new changelog scripts. |
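The "simulated fallback" mentioned in the table could look something like the sketch below, which derives a deterministic pseudo-embedding from a hash of the text so that tests can run without an API key. This is an assumption about the general approach, not the PR's code: the root-level script is described as using NumPy, whereas this example uses only the standard library, and the name `simulated_embedding` and the default dimension of 8 are hypothetical.

```python
import hashlib
import math
import random


def simulated_embedding(text: str, dim: int = 8) -> list:
    """Return a deterministic unit-length pseudo-embedding for text.

    Hypothetical stand-in for a real embedding API call: the vector
    carries no semantic meaning, it is only stable for a given input.
    """
    # Seed a private RNG from the text's SHA-256 digest so the same
    # input always yields the same vector.
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    # Normalize to unit length; fall back to 1.0 to avoid dividing by zero.
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]
```

Because the vectors are random rather than semantic, a fallback like this is suitable for exercising the pipeline in CI but not for judging retrieval quality.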
Files not reviewed (3)
- .dockerignore: Language not supported
- Dockerfile: Language not supported
- docker-entrypoint.sh: Language not supported
…artifact actions to version 4